14 research outputs found

    Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation

    Learning-based approaches to robotic manipulation are limited by the scalability of data collection and the accessibility of labels. In this paper, we present a multi-task domain adaptation framework for instance grasping in cluttered scenes that utilizes simulated robot experiments. Our neural network takes monocular RGB images and the instance segmentation mask of a specified target object as inputs, and predicts the probability of successfully grasping the specified object for each candidate motor command. The proposed transfer learning framework trains a model for instance grasping in simulation and uses a domain-adversarial loss to transfer the trained model to real robots using indiscriminate grasping data, which is available both in simulation and in the real world. We evaluate our model in real-world robot experiments, comparing it with alternative model architectures as well as an indiscriminate grasping baseline. Comment: ICRA 201
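    The abstract does not include code; the sketch below shows one common way to implement the domain-adversarial piece in PyTorch, using a gradient-reversal layer so that a domain classifier (simulation vs. real) pushes the shared features toward domain invariance while a grasp-success head trains on the simulated instance-grasping labels. The layer sizes, the 512-dimensional input feature (standing in for the encoded RGB image, mask, and motor command), and the `lambda_adv` weight are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class GraspModel(nn.Module):
    """Shared encoder with a grasp-success head and an adversarial domain head (illustrative sizes)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())
        self.grasp_head = nn.Linear(feat_dim, 1)    # P(success | image, mask, motor command)
        self.domain_head = nn.Linear(feat_dim, 1)   # simulation vs. real

    def forward(self, x, lam=1.0):
        feat = self.encoder(x)
        grasp_logit = self.grasp_head(feat)
        domain_logit = self.domain_head(GradReverse.apply(feat, lam))
        return grasp_logit, domain_logit

def total_loss(grasp_logit, grasp_label, domain_logit, domain_label, lambda_adv=0.1):
    # Task loss on (simulated) instance-grasping labels plus a domain-confusion term
    # computed on indiscriminate grasping data from both domains.
    task = F.binary_cross_entropy_with_logits(grasp_logit, grasp_label)
    domain = F.binary_cross_entropy_with_logits(domain_logit, domain_label)
    return task + lambda_adv * domain
```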

    Using Computer Vision To Label And Search A Physical Space

    Effective operation of a warehouse requires keeping track of the location of various assets within the physical environment. As various sensors are carried through the warehouse environment by operators, range data collected by the sensors over time can be used to reconstruct 2D and 3D representations of the space. This disclosure describes techniques to estimate the locations of points of interest (POIs) and regions of interest (ROIs) within a physical environment such as a warehouse. The location estimates are generated using a combination of 2D visual search of images containing text labels and barcodes, 2D/3D environment reconstruction using sensor data, and the estimated trajectory of the sensors. Computer vision techniques are applied to visual data obtained from operational processes that generate images, such as feeds from stationary cameras, images from moving cameras, photos of the environment, etc.
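    As a rough illustration of how the 2D detections and the reconstructed sensor trajectory could be fused into a POI location estimate, the sketch below back-projects the pixel center of a detected label through a pinhole camera model using the per-frame pose from the trajectory, then averages the observations. The function names, the availability of a depth value per detection, and the simple mean fusion are assumptions for illustration, not the disclosure's exact procedure.

```python
import numpy as np

def backproject(K, R, t, pixel_uv, depth):
    """Lift the 2D center of a detected text label or barcode into world coordinates,
    using the camera intrinsics K and the camera pose (R, t) taken from the
    reconstructed sensor trajectory at the time of the observation."""
    u, v = pixel_uv
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray in the camera frame
    point_cam = ray_cam * depth                          # scale by the measured range
    return R @ point_cam + t                             # transform into the world frame

def estimate_poi(detections):
    """Fuse repeated observations of the same label into one POI location estimate.
    Each detection is a dict with keys "K", "R", "t", "uv", "depth"; a simple mean
    is used here, though a real system might weight or robustify the fusion."""
    points = [backproject(d["K"], d["R"], d["t"], d["uv"], d["depth"]) for d in detections]
    return np.mean(points, axis=0)
```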

    Online learning of patch perspective rectification for efficient object detection

    For a large class of applications, there is time to train the system. In this paper, we propose a learning-based approach to patch perspective rectification and show that it is both faster and more reliable than state-of-the-art ad hoc affine region detection methods. Our method proceeds in three steps. First, a classifier provides for every keypoint not only its identity but also a first estimate of its transformation. This estimate allows carrying out, in the second step, an accurate perspective rectification using linear predictors. We show that both the classifier and the linear predictors can be trained online, which makes the approach convenient. The last step is a fast verification, made possible by the accurate perspective rectification, of the patch identity and its sub-pixel precision position estimate. We test our approach on real-time 3D object detection and tracking applications, and show that we can use the estimated perspective rectifications to determine the object pose; as a result, we need far fewer correspondences to obtain a precise pose estimate.
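    The second step, perspective rectification with linear predictors, has a simple generic form: a learned matrix maps the intensity residual between the patch sampled under the current transformation estimate and the reference patch to a correction of the transformation parameters. The sketch below shows that update loop; the warp function, the parameterization of `p`, and the matrix `A` are placeholders rather than the paper's trained predictors.

```python
import numpy as np

def refine(p, A, reference, image, warp, n_iters=5):
    """Iteratively refine transformation parameters p with a learned linear predictor A.

    At each iteration, the image is sampled under the current parameters and the
    intensity residual against the reference patch is mapped to a parameter update:
        p <- p + A @ (warp(image, p) - reference)
    """
    ref = reference.ravel().astype(np.float64)
    for _ in range(n_iters):
        sampled = warp(image, p).ravel().astype(np.float64)  # patch under the current estimate
        p = p + A @ (sampled - ref)                           # linear correction of the parameters
    return p
```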

    Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects

    We present a method for real-time 3D object detection that does not require a time-consuming training stage and can handle untextured objects. At its core is a novel template representation designed to be robust to small image transformations. This robustness, based on dominant gradient orientations, lets us test only a small subset of all possible pixel locations when parsing the image and represent a 3D object with a limited set of templates. We show that, together with a binary representation that makes evaluation very fast and a branch-and-bound approach to efficiently scan the image, it can detect untextured objects in complex situations and provide their 3D pose in real time.
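    The binary template matching described above can be sketched as follows: quantize gradient orientations into a few bins, OR the bits present inside each small cell into one bitmask (which gives tolerance to small shifts), and score a template at an image location by counting cells whose masks share at least one orientation bit. The 8-bin quantization, cell size, and thresholds below are illustrative choices, not the exact representation from the paper.

```python
import numpy as np

N_BINS = 8  # orientation quantization (illustrative)

def orientation_mask(gx, gy, mag_thresh=10.0):
    """Per-pixel bitmask with one bit set for the quantized gradient orientation
    (zero where the gradient magnitude is too weak to be reliable)."""
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)              # orientations taken modulo 180 degrees
    bins = (ang / np.pi * N_BINS).astype(int) % N_BINS
    return np.where(mag > mag_thresh, 1 << bins, 0).astype(np.uint8)

def cell_masks(mask, cell=8):
    """OR the orientation bits within each cell, giving robustness to small image transformations."""
    h, w = mask.shape
    out = np.zeros((h // cell, w // cell), dtype=np.uint8)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            block = mask[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            out[i, j] = np.bitwise_or.reduce(block.ravel())
    return out

def similarity(template_cells, image_cells):
    """Count cells whose dominant orientations overlap (non-zero bitwise AND)."""
    return int(np.count_nonzero(template_cells & image_cells))
```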

    Learning Real-Time Perspective Patch Rectification

    We propose two learning-based methods for patch rectification that are faster and more reliable than state-of-the-art affine region detection methods. Given a reference view of a patch, they can quickly recognize it in new views and accurately estimate the homography between the reference view and the new view. Our methods consume more memory than affine region detectors and are in practice currently limited to a few tens of patches. However, if the reference image is a fronto-parallel view and the internal parameters are known, one single patch is often enough to precisely estimate an object pose. As a result, we can deal in real time with objects that are significantly less textured than the ones required by state-of-the-art methods. The first method favors fast run-time performance while the second is designed for fast real-time learning and robustness; however, they follow the same general approach: first, a classifier provides for every keypoint a first estimate of its transformation; then, the estimate allows carrying out an accurate perspective rectification using linear predictors; the last step is a fast verification, made possible by the accurate perspective rectification, of the patch identity and its sub-pixel precision position estimation. We demonstrate th
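    The claim that a single fronto-parallel reference patch is enough for pose estimation can be illustrated with a planar PnP solve: once the homography from the reference patch to the new view has been estimated, the patch's four corners give four 3D-2D correspondences, which together with the known camera intrinsics determine the pose. The sketch below assumes a unit-square reference patch and a known physical patch size, and uses OpenCV only as a convenient solver; none of this is the paper's implementation.

```python
import numpy as np
import cv2

def pose_from_patch(H, K, patch_size=0.1):
    """Recover an object pose from a single rectified reference patch.

    H: 3x3 homography mapping unit-square reference-patch coordinates to the new view.
    K: 3x3 camera intrinsics.
    patch_size: assumed physical side length of the patch in meters.
    """
    s = patch_size / 2.0
    # Patch corners on the object plane (z = 0) ...
    obj_pts = np.array([[-s, -s, 0.0], [s, -s, 0.0], [s, s, 0.0], [-s, s, 0.0]])
    # ... and the matching corners of the unit-square reference patch, pushed through H.
    ref_corners = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0],
                            [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]])
    proj = (H @ ref_corners.T).T
    img_pts = proj[:, :2] / proj[:, 2:3]                 # corner positions in the new view (pixels)
    # Four coplanar 3D-2D correspondences are enough for a planar PnP solve.
    ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, None, flags=cv2.SOLVEPNP_IPPE)
    return rvec, tvec
```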